Speaker Identification Using Pseudo Pitch Synchronized Phase Information in Voiced Sound

Authors

  • Kohta Shimada
  • Kazumasa Yamamoto
  • Seiichi Nakagawa
Abstract

Conventional speaker identification methods based on mel-frequency cepstral coefficients (MFCCs) ignore phase information. Our recent studies have shown that phase information contains speaker-dependent characteristics. We propose a new method that extracts pitch-synchronous phase information from voiced sections only. Speaker identification experiments were performed using the NTT clean database and the JNAS database. Using the new phase extraction method, we obtained relative reductions in the speaker identification error rate of approximately 27% and 46%, respectively, for the two databases. We also obtained relative error reductions of approximately 52% and 42%, respectively, when combining phase information with the MFCC-based method.
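The relative reductions quoted in the abstract can be read as the drop in error rate divided by the baseline error rate. A minimal sketch of that arithmetic follows; the function name and the numeric error rates are hypothetical and chosen only for illustration, since the abstract does not report absolute error rates.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the relative reduction in error rate, as a percentage.

    Both arguments are error rates in the same unit (for example, percent).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical error rates in percent (NOT taken from the paper).
baseline = 10.0
improved = 7.3
print(f"{relative_error_reduction(baseline, improved):.1f}% relative reduction")
```

Under these assumed numbers, an absolute drop of 2.7 percentage points corresponds to a relative reduction of about 27%, which shows why a relative figure alone does not reveal the absolute error rates.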

Related resources

Epoch Extraction of Voiced Speech

A general theory of epoch extraction of overlapping nonidentical waveforms is presented. The theory is applied to outputs of models of voiced speech production mechanism and to actual speech data. Some typical glottal waveshapes are considered to explain their effect on the speech output. It is shown that the points of excitation of the vocal tract can be precisely identified for continuous spe...

Pitch synchronized speech processing (PSSP) for speaker recognition

A method for speech signal enhancement is developed with application to automatic speaker recognition where the signals have different channel conditions. The basis of this technique is a robust pitch detection algorithm that accurately estimates the instantaneous pitch rate, and extracts single pitch period speech segments. This technique of pitch synchronized speech processing (PSSP) provides...

Combining pitch and MFCC for speaker identification systems

Usually, speaker recognition systems do not take into account the dependence between the vocal source and the vocal tract. A feasibility study that retains this dependence is presented here. A model of joint probability functions of the pitch and the feature vectors is proposed. Three strategies are designed and compared for all female speakers taken from the SPIDRE corpus. The first operates o...

Ūkio Technologinis ir Ekonominis Vystymas (Technological and Economic Development of Economy)

The problem of speaker identification is investigated. Basic segments, that is, pseudo-stationary intervals of voiced sounds, are used for identification. Identification is carried out by comparing average distances between the investigated speaker and the comparison speakers. The coefficients of the linear prediction (LPC) model of the vocal tract are used as identification features. Such a problem arises in stenographi...

New Refinement Schemes for Voice Conversion

New refinement schemes for voice conversion are proposed in this paper. We take mel-frequency cepstral coefficients (MFCCs) as the basic feature and adopt cepstral mean subtraction to compensate for channel effects. We propose an S/U/V (silence/unvoiced/voiced) decision rule such that two sets of codebooks are used to capture the difference between unvoiced and voiced segments of the source speaker...

Publication date: 2011